The Merck Gene Index browser: an extensible data integration system for gene finding, gene characterization and EST data mining

Authors

  • Barbara A. Eckman
  • Jeffery S. Aaronson
  • J. A. Borkowski
  • W. J. Bailey
  • K. O. Elliston
  • A. R. Williamson
  • R. A. Blevins

Abstract

MOTIVATION: To make effective use of the vast amounts of expressed sequence tag (EST) sequence data generated by the Merck-sponsored EST project and other similar efforts, sequences must be organized into gene classes, and scientists must be able to 'mine' the gene class data in the context of related genomic data. RESULTS: This paper presents the Merck Gene Index browser, an easily extensible, World Wide Web-based system for mining the Merck Gene Index (MGI) and related genomic data. The MGI is a non-redundant set of clones and sequences, each representing a distinct gene, constructed from all high-quality 3' EST sequences generated by the Merck-sponsored EST project. The MGI browser integrates data from a variety of sources and storage formats, both local and remote, using an eclectic integration strategy that includes a federation of relational databases, a local data warehouse and simple hypertext links. Data currently integrated include: LENS cDNA clone and EST data, dbEST protein and non-EST nucleic acid similarity data, WashU sequence chromatograms, Entrez sequence and Medline entries, and UniGene gene clusters. Flatfile sequence data are accessed using the Bioapps server, an internally developed client-server system that supports generic sequence analysis applications. Browser data are retrieved and formatted by means of the Bioinformatics Data Integration Toolkit (B-DIT), a new suite of Perl routines.


Similar resources

Integration and Reduction of Microarray Gene Expressions Using an Information Theory Approach

The DNA microarray is an important technique that allows researchers to analyze large amounts of gene expression data in parallel. Although the data can be more significant if they come from separate experiments, one of the most challenging phases in the microarray context is the integration of separate expression level datasets that have been gathered through different techniques. In this paper, we prese...


Molecular Characterization of the Factor IX Gene in 28 Iranian Hemophilia B Patients

Background: Heterogeneous mutations in the human coagulation factor IX gene lead to an X-linked recessive bleeding disorder known as hemophilia B. The disease is distributed worldwide with no ethnic or geographical priority. Materials and Methods: The aim of this study was to characterize the factor IX gene mutations in 28 unrelated Iranian hemophilia B patients. Polymerase chain reaction (PCR)...


Prediction of Acid Mine Drainage Generation Potential of A Copper Mine Tailings Using Gene Expression Programming-A Case Study

This work presents a quantitative prediction of the likely acid mine drainage (AMD) generation process throughout tailing particles resulting from the Sarcheshmeh copper mine in the south of Iran. Indeed, four predictive relationships for the remaining pyrite fraction, remaining chalcopyrite fraction, sulfate concentration, and pH have been suggested by applying the gene expression programming (GEP) a...


P-215: Discovery of A Novel APA Variant of A Human Potential Gene Based on Expressed Sequenced Tags Analysis

Background: Expressed sequence tags (ESTs) are sequences of cDNA fragments prepared from different tissue sources. There are over one million of these sequences in the publicly available database, and these sequences are believed to represent more than half of all human genes. The ESTs belong to different cDNA libraries, each prepared from one particular cell type, organ, or tumor. Therefore, th...


Data Mining for Identification of Forkhead Box O (FOXO3a) in Different Organisms Using Nucleotide and Tandem Repeat Sequences

Background: Deregulation of the FOXO3a gene, which belongs to the Forkhead box O (FOXO) transcription factors, can cause cancer (e.g. breast cancer). FOXO factors have an important role in ubiquitination, acetylation, de-acetylation, protein-protein interactions and phosphorylation. Understanding the regulation and mechanisms of FOXO3a can lead to cancer treatment. The aim of this study recent association...



Journal title:
  • Bioinformatics

Volume 14, Issue 1

Pages -

Publication date: 1998